AITopics | language model

Collaborating Authors

language model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The Feeling of Control Slipping Away

The Atlantic - TechnologyMay-30-2026, 11:00:00 GMT

AI is causing a crisis of agency. Back in the web-traffic-obsessed days of 2018, at a time of dawning awareness of how easily audiences online could be manipulated and spoofed by bots, the writer Max Read argued that the internet had crossed a threshold known as "the Inversion." Not only had bots proliferated across the internet; they had come to constitute it. In outnumbering humans, bots were also loosening everyone's grasp on the very reality of online experience. "What's gone from the internet, after all, isn't'truth,' but trust: the sense that the people and things we encounter are what they represent themselves to be," Read wrote.

artificial intelligence, information, internet, (10 more...)

The Atlantic - Technology

Industry: Information Technology (1.00)

Technology:

Information Technology > Communications > Networks (0.78)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.41)

Add feedback

Efficient Benchmarking Is Just Feature Selection and Multiple Regression

Bowyer, Sam, Locatelli, Acyr, Cao, Kris

arXiv.org Machine LearningMay-26-2026

Efficient benchmarking techniques aim to lower the computational cost of evaluating LLMs by predicting full benchmark scores using only a subset of a benchmark's questions. By reframing this problem as an instance of multiple regression with feature selection, we find that existing efficient benchmarking methods can be greatly improved by simply using kernel ridge regression at the prediction stage. Additionally, using an information-theoretic feature-selection algorithm called minimum redundancy maximum relevance (mRMR), we can further improve upon these methods by selecting question subsets that will be maximally useful for prediction. Except in very data-poor settings, these approaches consistently achieve smaller prediction errors (in both MAE and RMSE), and greater ranking correlation between predicted and true scores (in both Spearman $ρ$ and Kendall $τ$) across a range of benchmarks using both binary and continuous metrics. Furthermore, mRMR subsampling is much faster than competitor methods (which often involve fitting probabilistic models or running clustering algorithms), and is more likely to select the same questions under different random seeds or training data splits. Tutorial code can be found at https://github.com/sambowyer/mrmr_eval .

artificial intelligence, coreset, machine learning, (13 more...)

arXiv.org Machine Learning

2605.25773

Country:

Europe (1.00)
North America > United States > California (0.46)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Add feedback

HawkesLLM: Semantic Uncertainty Propagation in Agentic Text Simulation

Deng, Zewei, Ye, Tinghan, Xie, Liyan

arXiv.org Machine LearningMay-25-2026

Agentic text-simulation systems write in sequence, with each item becoming possible context for later steps. That makes uncertainty path-dependent: an early ambiguity can affect later outputs. This paper studies this problem with HawkesLLM, a framework that separates temporal influence modeling from text generation. We represent the cascade as a network whose nodes are text-generating agents. A multivariate Hawkes process models how these nodes activate over time and which earlier node outputs should influence later prompts. A language model then writes each new event from the compact memory selected by this temporal model. We evaluate the framework on a held-out Global Database of Events, Language, and Tone (GDELT) news-cascade case study. The diagnostics track semantic alignment with local held-out references and separate local drift from global drift. In this setting, HawkesLLM improves late-stage semantic alignment under a compact prompt-memory budget.

artificial intelligence, hawkesllm, natural language, (15 more...)

arXiv.org Machine Learning

2605.23043

Country: North America > United States (0.29)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications (0.94)

Add feedback

Causal Bias Detection in Generative Artificial Intelligence

Plecko, Drago

arXiv.org Machine LearningMay-19-2026

Automated systems built on artificial intelligence (AI) are increasingly deployed across high-stakes domains, raising critical concerns about fairness and the perpetuation of demographic disparities that exist in the world. In this context, causal inference provides a principled framework for reasoning about fairness, as it links observed disparities to underlying mechanisms and aligns naturally with human intuition and legal notions of discrimination. Prior work on causal fairness primarily focuses on the standard machine learning setting, where a decision-maker constructs a single predictive mechanism $f_{\widehat Y}$ for an outcome variable $Y$, while inheriting the causal mechanisms of all other covariates from the real world. The generative AI setting, however, is markedly more complex: generative models can sample from arbitrary conditionals over any set of variables, implicitly constructing their own beliefs about all causal mechanisms rather than learning a single predictive function. This fundamental difference requires new developments in causal fairness methodology. We formalize the problem of causal fairness in generative AI and unify it with the standard ML setting under a common theoretical framework. We then derive new causal decomposition results that enable granular quantification of fairness impacts along both (a) different causal pathways and (b) the replacement of real-world mechanisms by the generative model's mechanisms. We establish identification conditions and introduce efficient estimators for causal quantities of interest, and demonstrate the value of our methodology by analyzing race and gender bias in large language models across different datasets.

machine learning, mechanism, natural language, (19 more...)

arXiv.org Machine Learning

2605.11365

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.91)

Add feedback

NeuroMAS: Multi-Agent Systems as Neural Networks with Joint Reinforcement Learning

Lu, Haoran, Fang, Luyang, Zhong, Wenxuan, Ma, Ping

arXiv.org Machine LearningMay-19-2026

Multi-agent language systems are often built as hand-designed workflows, where agents are assigned semantic roles and communication protocols are specified in advance. We propose NeuroMAS, a method that first treats a multi-agent language system as a trainable and scalable neural-network-like architecture with LLM agents as nodes and intermediate textual signals as edges. In NeuroMAS, agent nodes are role-free but structure-aware: the topology only determines how information can flow in general, while reinforcement learning training determines how nodes communicate, specialize, and coordinate. This formulation shifts multi-agent design from workflow engineering toward architecture design, where depth, width, connectivity, and growth protocol become scalable sources of capability. Further, we provide a theoretical perspective showing why such modular textual computation is more parameter-efficient when tasks admit hierarchical decompositions. Experiments show that NeuroMAS improves significantly over both inference-time and trained multi-agent baselines. We further find that organizational scaling is path-dependent: larger systems can be challenging to train from scratch, but become feasible when grown progressively from smaller trained systems. These results suggest that learned neural multi-agent systems are a promising scaling axis for LLMs.

artificial intelligence, machine learning, node, (18 more...)

arXiv.org Machine Learning

2605.16757

Country: North America > United States > Georgia > Clarke County > Athens (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Consolidation-Expansion Operator Mechanics:A Unified Framework for Adaptive Learning

Guha, Debashis

arXiv.org Machine LearningMay-14-2026

Every adaptive learning system must alternate between two operations: consolidating what it already knows and expanding into new evidence. We propose \emph{Consolidation-Expansion Operator Mechanics} (OpMech), a framework that makes this structure precise. The central object is the \emph{order-gap} $\Ogap(θ; e)$, the degree to which a consolidation operator~$Q$ and an expansion operator~$P_e$ fail to commute at a given knowledge state. Because the order-gap is computable from the system's own trajectory, it serves as a real-time control signal: large values indicate that the system is still sensitive to the ordering of consolidation and expansion; once the order-gap falls and stays small, further processing is unlikely to change the outcome. Three results give the signal precise meaning: the order-gap decays along convergent trajectories; a persistently large order-gap implies the system is far from its settled state; and an order-gap-based stopping rule terminates with provable guarantees in both noiseless and bounded-noise settings. The framework applies across five domains: bandits, reinforcement learning, stochastic optimization, continual learning, and recursive language models. We give conditions under which the order-gap reliably tracks convergence in three representative cases. We develop the recursive language model application in detail, showing how OpMech replaces heuristic stopping rules and fixed recursion budgets with principled, evidence-driven alternatives.

assumption 4, machine learning, reinforcement learning, (17 more...)

arXiv.org Machine Learning

2605.09968

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Learning Perturbations to Extrapolate Your LLM

Cen, Zetai, Gu, Chenfei, Zhu, Jin, Li, Ting, Chen, Yunxiao, Shi, Chengchun

arXiv.org Machine LearningMay-14-2026

Training large language models (LLMs) such as GPT-5 and Qwen-3 (Singh et al., 2025; Yang et al., 2025) on massive text corpora aims at capturing the underlying distribution of natural language. Yet, it remains challenging for the trained model to extrapolate to out-of-distribution or out-of-domain settings beyond the support of its training data. The literature has seen the development of various data perturbation techniques, such as synonym replacement, random insertion, deletion, and swap, that modify training instances into semantically similar variants to effectively expose LLMs to a broader range of inputs and improve their ability to generalize beyond the training data (Feng et al., 2019, 2020; Li et al., 2024; Cen et al., 2026). However, their approach remains grounded in the discrete, word-level augmentation procedures mentioned previously, which may restrict its adaptivity across diverse domains. While discrete perturbations are simple to use, they could be too coarse and hard to refine due to the complexity of natural language (Park et al., 2022; Li et al., 2023). Meanwhile, fixed perturbations apply the same transformations to the data regardless of the contexts, thus failing to generalize appropriately (Ismailov and Asanova, 2025).

large language model, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2605.13284

Country:

Asia (0.93)
North America > United States (0.67)
Europe (0.67)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

0dc91de822b71c66a7f54fa121d8cbb9-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsMay-1-2026, 06:26:46 GMT

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America (0.93)
Asia > India (0.72)

Genre:

Questionnaire & Opinion Survey (0.68)
Overview (0.46)

Industry: Education > Educational Setting > Higher Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.68)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.68)

Add feedback

RaLEs: a Benchmark for Radiology Language Evaluations

Neural Information Processing SystemsApr-30-2026, 04:43:15 GMT

The radiology report is the main form of communication between radiologists and other clinicians. Prior work in natural language processing in radiology reports has shown the value of developing methods tailored for individual tasks such as identifying reports with critical results or disease detection. Meanwhile, English and biomedical natural language understanding benchmarks such as the General Language Understanding and Evaluation as well as Biomedical Language Understanding and Reasoning Benchmark have motivated the development of models that can be easily adapted to address many tasks in those domains. Here, we characterize the radiology report as a distinct domain and introduce RaLEs, the Radiology Language Evaluations, as a benchmark for natural language understanding and generation in radiology. RaLEs is comprised of six natural language understanding and generation evaluations including the extraction of anatomical and disease entities and their relations, procedure selection, and report summarization. We characterize the performance of models designed for the general, biomedical, clinical and radiology domains across these tasks. We find that advances in the general and biomedical domains do not necessarily translate to radiology, and that certain more advanced models from the general domain can perform comparably to smaller clinical-specific models. The limited performance of existing pre-trained models on RaLEs highlights the opportunity to improve domain-specific self-supervised models for natural language processing in radiology. We propose RaLEs as a benchmark to promote and track the development of such domain-specific radiology language models.

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Minnesota (0.28)

Genre: